Overview

Dataset statistics

Number of variables: 15
Number of observations: 6081641
Missing cells: 3655639
Missing cells (%): 4.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 696.0 MiB
Average record size in memory: 120.0 B
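
The derived figures in this table can be checked from the raw counts. The sketch below is only an illustration: it assumes the missing-cell percentage is taken over variables × observations and that 1 MiB is 1024 × 1024 bytes. The variable names are made up for this example.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 15
n_observations = 6_081_641
missing_cells = 3_655_639
total_memory_mib = 696.0

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_memory_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the results round to the 4.0% and 120.0 B shown in the table.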

Variable types

Categorical: 9
Numeric: 6

Alerts

ride_id has a high cardinality: 6081641 distinct values High cardinality
started_at has a high cardinality: 5125092 distinct values High cardinality
ended_at has a high cardinality: 5138667 distinct values High cardinality
start_station_name has a high cardinality: 1702 distinct values High cardinality
start_station_id has a high cardinality: 1318 distinct values High cardinality
end_station_name has a high cardinality: 1725 distinct values High cardinality
end_station_id has a high cardinality: 1323 distinct values High cardinality
start_lat is highly correlated with start_lng and 1 other fields High correlation
start_lng is highly correlated with start_lat and 1 other fields High correlation
end_lat is highly correlated with start_lat and 1 other fields High correlation
end_lng is highly correlated with start_lng and 1 other fields High correlation
duration_in_mins is highly correlated with distance_km High correlation
distance_km is highly correlated with duration_in_mins High correlation
start_lat is highly correlated with start_lng and 1 other fields High correlation
start_lng is highly correlated with start_lat High correlation
end_lat is highly correlated with start_lat and 2 other fields High correlation
end_lng is highly correlated with end_lat and 1 other fields High correlation
distance_km is highly correlated with end_lat and 1 other fields High correlation
start_lat is highly correlated with end_lat High correlation
start_lng is highly correlated with end_lng High correlation
end_lat is highly correlated with start_lat High correlation
end_lng is highly correlated with start_lng High correlation
duration_in_mins is highly correlated with distance_km High correlation
distance_km is highly correlated with duration_in_mins High correlation
start_lat is highly correlated with start_lng High correlation
start_lng is highly correlated with start_lat High correlation
end_lat is highly correlated with end_lng and 1 other fields High correlation
end_lng is highly correlated with end_lat and 1 other fields High correlation
distance_km is highly correlated with end_lat and 1 other fields High correlation
start_station_name has 886328 (14.6%) missing values Missing
start_station_id has 886460 (14.6%) missing values Missing
end_station_name has 941355 (15.5%) missing values Missing
end_station_id has 941496 (15.5%) missing values Missing
end_lat is highly skewed (γ1 = -324.798497) Skewed
end_lng is highly skewed (γ1 = 771.2368006) Skewed
duration_in_mins is highly skewed (γ1 = 261.3644865) Skewed
distance_km is highly skewed (γ1 = 835.5510045) Skewed
ride_id is uniformly distributed Uniform
started_at is uniformly distributed Uniform
ended_at is uniformly distributed Uniform
ride_id has unique values Unique
duration_in_mins has 137843 (2.3%) zeros Zeros
distance_km has 328459 (5.4%) zeros Zeros
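
The γ1 values in the Skewed alerts are skewness coefficients. As a rough illustration, the sketch below computes the moment-based (Fisher-Pearson) coefficient g1 = m3 / m2^1.5 on toy data. This is an assumption about the general technique, not a statement of the exact estimator the report uses, which may apply a sample-size correction. The function name and the toy lists are hypothetical.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction).

    Assumes at least two distinct values; a constant list would
    divide by zero.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Toy data (not taken from the dataset).
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
long_right_tail = [1.0] * 99 + [1000.0]  # one extreme value

print(skewness(symmetric))        # symmetric data: skewness of zero
print(skewness(long_right_tail))  # large positive skewness
```

A single extreme value among many ordinary ones is enough to produce a large coefficient, which is one plausible reading of the very large γ1 values reported for duration_in_mins and distance_km.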

Reproduction

Analysis started: 2023-04-30 04:45:39.104121
Analysis finished: 2023-04-30 04:53:01.413675
Duration: 7 minutes and 22.31 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

ride_id
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 6081641
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 46.4 MiB

550CF7EFEAE0C618          1
D5D9F57D6DEE6573          1
8EA7198A672FD1BE          1
69FB93A62F116E80          1
870F0ED4E79871E5          1
Other values (6081636)    6081636

Length

Max length: 16
Median length: 16
Mean length: 16
Min length: 16

Characters and Unicode

Total characters: 97306256
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
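
As a small illustration of these character properties, the sketch below looks up the Unicode general category of each character in two ride_id values taken from this report, using Python's standard `unicodedata` module. It covers only the category property. Script and block lookups are not part of `unicodedata` and are left out.

```python
import unicodedata
from collections import Counter

# Two ride_id values that appear in this report's Sample section.
sample_ride_ids = ["550CF7EFEAE0C618", "DAD198F405F9C5F5"]

# "Nd" is Decimal Number, "Lu" is Uppercase Letter.
category_counts = Counter(
    unicodedata.category(ch)
    for ride_id in sample_ride_ids
    for ch in ride_id
)

for category, count in sorted(category_counts.items()):
    print(category, count)
```

Only two categories show up for these values, which is consistent with the two distinct categories reported for ride_id.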

Unique

Unique: 6081641
Unique (%): 100.0%

Sample

1st row: 550CF7EFEAE0C618
2nd row: DAD198F405F9C5F5
3rd row: E6F2BC47B65CB7FD
4th row: F597830181C2E13C
5th row: 0CE689BB4E313E8D

Common Values

Value                     Count      Frequency (%)
550CF7EFEAE0C618          1          < 0.1%
D5D9F57D6DEE6573          1          < 0.1%
8EA7198A672FD1BE          1          < 0.1%
69FB93A62F116E80          1          < 0.1%
870F0ED4E79871E5          1          < 0.1%
DA6B7A72B6FDF93F          1          < 0.1%
3EF916074FFC3585          1          < 0.1%
7EA32BDFB6EEAC01          1          < 0.1%
F4BFA36847700CDC          1          < 0.1%
EA716735F2E3DE70          1          < 0.1%
Other values (6081631)    6081631    > 99.9%

Length

[Figure: histogram of lengths of the category]

Value                     Count      Frequency (%)
550cf7efeae0c618          1          < 0.1%
d2f6d8f3feed6874          1          < 0.1%
68c474a4e92f24b6          1          < 0.1%
14a985a3838aa8cc          1          < 0.1%
e724b94bce2e7e36          1          < 0.1%
1aa3756a6f8189db          1          < 0.1%
1749361ac24efb77          1          < 0.1%
cbc1e1c17b07a999          1          < 0.1%
3bbdd414f3f543ce          1          < 0.1%
1edac7624ba69c98          1          < 0.1%
Other values (6081631)    6081631    > 99.9%

Most occurring characters

Value               Count       Frequency (%)
A                   6084749     6.3%
D                   6084627     6.3%
B                   6084436     6.3%
9                   6083751     6.3%
3                   6082816     6.3%
2                   6082576     6.3%
C                   6082564     6.3%
6                   6081493     6.2%
5                   6080971     6.2%
0                   6080899     6.2%
Other values (6)    36477374    37.5%

Most occurring categories

Value               Count       Frequency (%)
Decimal Number      60809563    62.5%
Uppercase Letter    36496693    37.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
96083751
10.0%
36082816
10.0%
26082576
10.0%
66081493
10.0%
56080971
10.0%
06080899
10.0%
76080752
10.0%
86079863
10.0%
16079159
10.0%
46077283
10.0%
Uppercase Letter
ValueCountFrequency (%)
A6084749
16.7%
D6084627
16.7%
B6084436
16.7%
C6082564
16.7%
F6080899
16.7%
E6079418
16.7%

Most occurring scripts

ValueCountFrequency (%)
Common60809563
62.5%
Latin36496693
37.5%

Most frequent character per script

Common
ValueCountFrequency (%)
96083751
10.0%
36082816
10.0%
26082576
10.0%
66081493
10.0%
56080971
10.0%
06080899
10.0%
76080752
10.0%
86079863
10.0%
16079159
10.0%
46077283
10.0%
Latin
ValueCountFrequency (%)
A6084749
16.7%
D6084627
16.7%
B6084436
16.7%
C6082564
16.7%
F6080899
16.7%
E6079418
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII97306256
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A6084749
 
6.3%
D6084627
 
6.3%
B6084436
 
6.3%
96083751
 
6.3%
36082816
 
6.3%
26082576
 
6.3%
C6082564
 
6.3%
66081493
 
6.2%
56080971
 
6.2%
06080899
 
6.2%
Other values (6)36477374
37.5%

rideable_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 46.4 MiB

electric_bike    3131659
classic_bike     2770634
docked_bike      179348

Length

Max length: 13
Median length: 13
Mean length: 12.48544644
Min length: 11

Characters and Unicode

Total characters: 75932003
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: electric_bike
2nd row: electric_bike
3rd row: electric_bike
4th row: electric_bike
5th row: electric_bike

Common Values

Value            Count      Frequency (%)
electric_bike    3131659    51.5%
classic_bike     2770634    45.6%
docked_bike      179348     2.9%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

Value            Count      Frequency (%)
electric_bike    3131659    51.5%
classic_bike     2770634    45.6%
docked_bike      179348     2.9%

Most occurring characters

ValueCountFrequency (%)
e12524307
16.5%
c11983934
15.8%
i11983934
15.8%
k6260989
8.2%
_6081641
8.0%
b6081641
8.0%
l5902293
7.8%
s5541268
7.3%
t3131659
 
4.1%
r3131659
 
4.1%
Other values (3)3308678
 
4.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter69850362
92.0%
Connector Punctuation6081641
 
8.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e12524307
17.9%
c11983934
17.2%
i11983934
17.2%
k6260989
9.0%
b6081641
8.7%
l5902293
8.4%
s5541268
7.9%
t3131659
 
4.5%
r3131659
 
4.5%
a2770634
 
4.0%
Other values (2)538044
 
0.8%
Connector Punctuation
ValueCountFrequency (%)
_6081641
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin69850362
92.0%
Common6081641
 
8.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e12524307
17.9%
c11983934
17.2%
i11983934
17.2%
k6260989
9.0%
b6081641
8.7%
l5902293
8.4%
s5541268
7.9%
t3131659
 
4.5%
r3131659
 
4.5%
a2770634
 
4.0%
Other values (2)538044
 
0.8%
Common
ValueCountFrequency (%)
_6081641
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII75932003
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e12524307
16.5%
c11983934
15.8%
i11983934
15.8%
k6260989
8.2%
_6081641
8.0%
b6081641
8.0%
l5902293
7.8%
s5541268
7.3%
t3131659
 
4.1%
r3131659
 
4.1%
Other values (3)3308678
 
4.4%

started_at
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 5125092
Distinct (%): 84.3%
Missing: 0
Missing (%): 0.0%
Memory size: 46.4 MiB

2022-05-30 13:05:15       9
2022-07-09 17:23:31       8
2022-10-24 16:59:45       8
2022-10-03 17:22:27       8
2022-08-10 17:25:13       7
Other values (5125087)    6081601

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 115551179
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4310611
Unique (%): 70.9%

Sample

1st row: 2022-08-07 21:34:15
2nd row: 2022-08-08 14:39:21
3rd row: 2022-08-08 15:29:50
4th row: 2022-08-08 02:43:50
5th row: 2022-08-07 20:24:06
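
The report lists both a Distinct and a Unique count for started_at. A minimal sketch of the difference follows, on the assumption that Distinct means the number of different values and Unique means the number of values that occur exactly once. The helper name and the toy list are hypothetical. The percentages at the end reuse the counts printed in this report.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Toy column: one timestamp repeated, two timestamps occurring once.
toy = [
    "2022-05-30 13:05:15",
    "2022-05-30 13:05:15",
    "2022-07-09 17:23:31",
    "2022-08-07 21:34:15",
]
distinct, unique = distinct_and_unique(toy)
print(distinct, unique)

# Percentages recomputed from the counts reported for started_at.
n_observations = 6_081_641
distinct_pct = 100 * 5_125_092 / n_observations
unique_pct = 100 * 4_310_611 / n_observations
print(f"{distinct_pct:.1f}% distinct, {unique_pct:.1f}% unique")
```

On the toy column this gives three distinct values, two of them unique. The recomputed percentages round to the 84.3% and 70.9% reported for started_at.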

Common Values

Value                     Count      Frequency (%)
2022-05-30 13:05:15       9          < 0.1%
2022-07-09 17:23:31       8          < 0.1%
2022-10-24 16:59:45       8          < 0.1%
2022-10-03 17:22:27       8          < 0.1%
2022-08-10 17:25:13       7          < 0.1%
2022-07-27 17:05:02       7          < 0.1%
2022-05-29 12:56:58       7          < 0.1%
2022-06-02 18:25:07       7          < 0.1%
2022-09-03 11:43:56       7          < 0.1%
2022-07-14 17:49:58       7          < 0.1%
Other values (5125082)    6081566    > 99.9%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
2022-07-0936104
 
0.3%
2022-04-2334893
 
0.3%
2022-06-1834255
 
0.3%
2022-05-2933324
 
0.3%
2022-07-3033288
 
0.3%
2022-06-2632925
 
0.3%
2022-07-1632507
 
0.3%
2022-06-1132474
 
0.3%
2022-08-2732172
 
0.3%
2022-05-2831885
 
0.3%
Other values (86743)11829455
97.3%

Most occurring characters

ValueCountFrequency (%)
225163436
21.8%
019070578
16.5%
112275204
10.6%
-12163282
10.5%
:12163282
10.5%
6081641
 
5.3%
35871259
 
5.1%
54894748
 
4.2%
44614811
 
4.0%
73465837
 
3.0%
Other values (3)9787101
 
8.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number85142974
73.7%
Dash Punctuation12163282
 
10.5%
Other Punctuation12163282
 
10.5%
Space Separator6081641
 
5.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
225163436
29.6%
019070578
22.4%
112275204
14.4%
35871259
 
6.9%
54894748
 
5.7%
44614811
 
5.4%
73465837
 
4.1%
83388201
 
4.0%
63234004
 
3.8%
93164896
 
3.7%
Dash Punctuation
ValueCountFrequency (%)
-12163282
100.0%
Other Punctuation
ValueCountFrequency (%)
:12163282
100.0%
Space Separator
ValueCountFrequency (%)
6081641
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common115551179
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
225163436
21.8%
019070578
16.5%
112275204
10.6%
-12163282
10.5%
:12163282
10.5%
6081641
 
5.3%
35871259
 
5.1%
54894748
 
4.2%
44614811
 
4.0%
73465837
 
3.0%
Other values (3)9787101
 
8.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII115551179
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
225163436
21.8%
019070578
16.5%
112275204
10.6%
-12163282
10.5%
:12163282
10.5%
6081641
 
5.3%
35871259
 
5.1%
54894748
 
4.2%
44614811
 
4.0%
73465837
 
3.0%
Other values (3)9787101
 
8.5%

ended_at
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 5138667
Distinct (%): 84.5%
Missing: 0
Missing (%): 0.0%
Memory size: 46.4 MiB

2022-08-22 12:47:49       20
2022-06-22 08:01:59       14
2022-06-07 16:12:09       11
2023-03-22 17:30:49       11
2023-03-21 16:59:40       10
Other values (5138662)    6081575

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 115551179
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4331896
Unique (%): 71.2%

Sample

1st row: 2022-08-07 21:41:46
2nd row: 2022-08-08 14:53:23
3rd row: 2022-08-08 15:40:34
4th row: 2022-08-08 02:58:53
5th row: 2022-08-07 20:29:58
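
Both started_at and ended_at are profiled here as Categorical strings rather than timestamps. The sketch below parses the first sample row of each column with the standard library and computes the elapsed time. It rests on two assumptions that the report does not confirm: that the first sample rows of the two columns belong to the same ride, and that a ride's duration in minutes is simply ended_at minus started_at.

```python
from datetime import datetime

# Format matching the 19-character values shown in the Sample sections.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"

# First sample row of started_at and of ended_at (assumed to be the same ride).
started_at = datetime.strptime("2022-08-07 21:34:15", TIMESTAMP_FORMAT)
ended_at = datetime.strptime("2022-08-07 21:41:46", TIMESTAMP_FORMAT)

elapsed = ended_at - started_at               # a datetime.timedelta
elapsed_minutes = elapsed.total_seconds() / 60

print(f"{elapsed_minutes:.2f} minutes")       # 7.52 minutes
```

Parsing these columns as datetimes before profiling would likely let the report treat them as a date/time type instead of flagging them as high-cardinality categoricals.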

Common Values

Value                     Count      Frequency (%)
2022-08-22 12:47:49       20         < 0.1%
2022-06-22 08:01:59       14         < 0.1%
2022-06-07 16:12:09       11         < 0.1%
2023-03-22 17:30:49       11         < 0.1%
2023-03-21 16:59:40       10         < 0.1%
2022-08-19 15:02:42       9          < 0.1%
2022-08-22 12:47:50       9          < 0.1%
2022-09-06 17:51:39       8          < 0.1%
2022-06-03 17:34:16       8          < 0.1%
2022-08-19 16:21:17       8          < 0.1%
Other values (5138657)    6081533    > 99.9%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
2022-07-0936010
 
0.3%
2022-04-2334575
 
0.3%
2022-06-1834283
 
0.3%
2022-05-2933300
 
0.3%
2022-07-3033245
 
0.3%
2022-06-2633214
 
0.3%
2022-06-1132386
 
0.3%
2022-07-1632348
 
0.3%
2022-08-2732041
 
0.3%
2022-09-1031849
 
0.3%
Other values (86742)11830031
97.3%

Most occurring characters

ValueCountFrequency (%)
225238570
21.8%
019067520
16.5%
112207979
10.6%
-12163282
10.5%
:12163282
10.5%
6081641
 
5.3%
35872180
 
5.1%
54919220
 
4.3%
44586264
 
4.0%
73442878
 
3.0%
Other values (3)9808363
 
8.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number85142974
73.7%
Dash Punctuation12163282
 
10.5%
Other Punctuation12163282
 
10.5%
Space Separator6081641
 
5.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
225238570
29.6%
019067520
22.4%
112207979
14.3%
35872180
 
6.9%
54919220
 
5.8%
44586264
 
5.4%
73442878
 
4.0%
83416344
 
4.0%
93198918
 
3.8%
63193101
 
3.8%
Dash Punctuation
ValueCountFrequency (%)
-12163282
100.0%
Other Punctuation
ValueCountFrequency (%)
:12163282
100.0%
Space Separator
ValueCountFrequency (%)
6081641
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common115551179
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
225238570
21.8%
019067520
16.5%
112207979
10.6%
-12163282
10.5%
:12163282
10.5%
6081641
 
5.3%
35872180
 
5.1%
54919220
 
4.3%
44586264
 
4.0%
73442878
 
3.0%
Other values (3)9808363
 
8.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII115551179
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
225238570
21.8%
019067520
16.5%
112207979
10.6%
-12163282
10.5%
:12163282
10.5%
6081641
 
5.3%
35872180
 
5.1%
54919220
 
4.3%
44586264
 
4.0%
73442878
 
3.0%
Other values (3)9808363
 
8.5%

start_station_name
Categorical

HIGH CARDINALITY
MISSING

Distinct: 1702
Distinct (%): < 0.1%
Missing: 886328
Missing (%): 14.6%
Memory size: 46.4 MiB

Streeter Dr & Grand Ave               77051
DuSable Lake Shore Dr & Monroe St     42405
Michigan Ave & Oak St                 40848
DuSable Lake Shore Dr & North Blvd    40810
Wells St & Concord Ln                 39990
Other values (1697)                   4954209

Length

Max length: 64
Median length: 50
Mean length: 23.91537218
Min length: 7

Characters and Unicode

Total characters: 124247844
Distinct characters: 72
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 111
Unique (%): < 0.1%

Sample

1st row: DuSable Museum
2nd row: Robert Fulton Elementary School
3rd row: California Ave & Milwaukee Ave
4th row: Campbell Ave & Montrose Ave
5th row: California Ave & Division St

Common Values

Value                                 Count      Frequency (%)
Streeter Dr & Grand Ave               77051      1.3%
DuSable Lake Shore Dr & Monroe St     42405      0.7%
Michigan Ave & Oak St                 40848      0.7%
DuSable Lake Shore Dr & North Blvd    40810      0.7%
Wells St & Concord Ln                 39990      0.7%
Clark St & Elm St                     37805      0.6%
Kingsbury St & Kinzie St              36412      0.6%
Millennium Park                       36298      0.6%
Wells St & Elm St                     33843      0.6%
Theater on the Lake                   33702      0.6%
Other values (1692)                   4776149    78.5%
(Missing)                             886328     14.6%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
4936954
18.9%
st4552158
17.4%
ave3673605
 
14.1%
dr379865
 
1.5%
clark378874
 
1.4%
lake334908
 
1.3%
blvd312073
 
1.2%
rd264340
 
1.0%
wells202036
 
0.8%
park192172
 
0.7%
Other values (775)10906111
41.7%

Most occurring characters

ValueCountFrequency (%)
20938019
16.9%
e10763904
 
8.7%
t8122379
 
6.5%
a6916077
 
5.6%
r5934353
 
4.8%
n5806928
 
4.7%
S5758029
 
4.6%
l5365248
 
4.3%
o5151405
 
4.1%
&4907807
 
4.0%
Other values (62)44583695
35.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter76211229
61.3%
Space Separator20938019
 
16.9%
Uppercase Letter20905377
 
16.8%
Other Punctuation4983328
 
4.0%
Decimal Number994491
 
0.8%
Open Punctuation92983
 
0.1%
Close Punctuation92983
 
0.1%
Dash Punctuation29434
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e10763904
14.1%
t8122379
10.7%
a6916077
9.1%
r5934353
 
7.8%
n5806928
 
7.6%
l5365248
 
7.0%
o5151405
 
6.8%
v4716835
 
6.2%
i4378690
 
5.7%
d2899451
 
3.8%
Other values (16)16155959
21.2%
Uppercase Letter
ValueCountFrequency (%)
S5758029
27.5%
A4136147
19.8%
C1429846
 
6.8%
W1277451
 
6.1%
D1119317
 
5.4%
L948692
 
4.5%
M887762
 
4.2%
B885771
 
4.2%
P791186
 
3.8%
R786337
 
3.8%
Other values (16)2884839
13.8%
Decimal Number
ValueCountFrequency (%)
5217034
21.8%
3168381
16.9%
1162035
16.3%
896490
9.7%
969314
 
7.0%
665720
 
6.6%
062352
 
6.3%
755243
 
5.6%
252809
 
5.3%
445113
 
4.5%
Other Punctuation
ValueCountFrequency (%)
&4907807
98.5%
*43119
 
0.9%
.31592
 
0.6%
/801
 
< 0.1%
"6
 
< 0.1%
;3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
20938019
100.0%
Open Punctuation
ValueCountFrequency (%)
(92983
100.0%
Close Punctuation
ValueCountFrequency (%)
)92983
100.0%
Dash Punctuation
ValueCountFrequency (%)
-29434
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin97116606
78.2%
Common27131238
 
21.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e10763904
 
11.1%
t8122379
 
8.4%
a6916077
 
7.1%
r5934353
 
6.1%
n5806928
 
6.0%
S5758029
 
5.9%
l5365248
 
5.5%
o5151405
 
5.3%
v4716835
 
4.9%
i4378690
 
4.5%
Other values (42)34202758
35.2%
Common
ValueCountFrequency (%)
20938019
77.2%
&4907807
 
18.1%
5217034
 
0.8%
3168381
 
0.6%
1162035
 
0.6%
896490
 
0.4%
(92983
 
0.3%
)92983
 
0.3%
969314
 
0.3%
665720
 
0.2%
Other values (10)320472
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII124247844
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
20938019
16.9%
e10763904
 
8.7%
t8122379
 
6.5%
a6916077
 
5.6%
r5934353
 
4.8%
n5806928
 
4.7%
S5758029
 
4.6%
l5365248
 
4.3%
o5151405
 
4.1%
&4907807
 
4.0%
Other values (62)44583695
35.9%

start_station_id
Categorical

HIGH CARDINALITY
MISSING

Distinct: 1318
Distinct (%): < 0.1%
Missing: 886460
Missing (%): 14.6%
Memory size: 46.4 MiB

13022                  77051
13300                  42405
13042                  40848
LF-005                 40810
TA1308000050           39990
Other values (1313)    4954077

Length

Max length: 37
Median length: 36
Mean length: 8.416704827
Min length: 3

Characters and Unicode

Total characters: 43726305
Distinct characters: 58
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 39
Unique (%): < 0.1%

Sample

1st row: KA1503000075
2nd row: 819
3rd row: 13084
4th row: 15623
5th row: 13256

Common Values

Value                  Count      Frequency (%)
13022                  77051      1.3%
13300                  42405      0.7%
13042                  40848      0.7%
LF-005                 40810      0.7%
TA1308000050           39990      0.7%
TA1307000039           37805      0.6%
KA1503000043           36412      0.6%
13008                  36298      0.6%
KA1504000135           33843      0.6%
TA1308000001           33702      0.6%
Other values (1308)    4776017    78.5%
(Missing)              886460     14.6%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
1302277051
 
1.5%
1330042405
 
0.8%
1304240848
 
0.8%
lf-00540810
 
0.8%
ta130800005039990
 
0.8%
ta130700003937805
 
0.7%
ka150300004336412
 
0.7%
1300836298
 
0.7%
ka150400013533843
 
0.7%
ta130800000133702
 
0.6%
Other values (1325)4780531
91.9%

Most occurring characters

ValueCountFrequency (%)
014161433
32.4%
17005928
16.0%
35579234
 
12.8%
A2529047
 
5.8%
52368241
 
5.4%
T2008835
 
4.6%
21805677
 
4.1%
41541706
 
3.5%
61476587
 
3.4%
71357654
 
3.1%
Other values (48)3891963
 
8.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number37352600
85.4%
Uppercase Letter5583724
 
12.8%
Lowercase Letter510636
 
1.2%
Dash Punctuation236376
 
0.5%
Other Punctuation34587
 
0.1%
Space Separator4514
 
< 0.1%
Close Punctuation1934
 
< 0.1%
Open Punctuation1934
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A2529047
45.3%
T2008835
36.0%
K537582
 
9.6%
L202339
 
3.6%
S102368
 
1.8%
W50115
 
0.9%
P41760
 
0.7%
F40810
 
0.7%
R30117
 
0.5%
N15103
 
0.3%
Other values (11)25648
 
0.5%
Lowercase Letter
ValueCountFrequency (%)
g88369
17.3%
i47327
9.3%
c46999
9.2%
a45478
8.9%
n45326
8.9%
r45239
8.9%
h45215
8.9%
t43540
8.5%
s43520
8.5%
x43119
8.4%
Other values (11)16504
 
3.2%
Decimal Number
ValueCountFrequency (%)
014161433
37.9%
17005928
18.8%
35579234
 
14.9%
52368241
 
6.3%
21805677
 
4.8%
41541706
 
4.1%
61476587
 
4.0%
71357654
 
3.6%
91235770
 
3.3%
8820370
 
2.2%
Other Punctuation
ValueCountFrequency (%)
.34551
99.9%
&36
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
-236376
100.0%
Space Separator
ValueCountFrequency (%)
4514
100.0%
Close Punctuation
ValueCountFrequency (%)
)1934
100.0%
Open Punctuation
ValueCountFrequency (%)
(1934
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common37631945
86.1%
Latin6094360
 
13.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
A2529047
41.5%
T2008835
33.0%
K537582
 
8.8%
L202339
 
3.3%
S102368
 
1.7%
g88369
 
1.5%
W50115
 
0.8%
i47327
 
0.8%
c46999
 
0.8%
a45478
 
0.7%
Other values (32)435901
 
7.2%
Common
ValueCountFrequency (%)
014161433
37.6%
17005928
18.6%
35579234
 
14.8%
52368241
 
6.3%
21805677
 
4.8%
41541706
 
4.1%
61476587
 
3.9%
71357654
 
3.6%
91235770
 
3.3%
8820370
 
2.2%
Other values (6)279345
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII43726305
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
014161433
32.4%
17005928
16.0%
35579234
 
12.8%
A2529047
 
5.8%
52368241
 
5.4%
T2008835
 
4.6%
21805677
 
4.1%
41541706
 
3.5%
61476587
 
3.4%
71357654
 
3.1%
Other values (48)3891963
 
8.9%

end_station_name
Categorical

HIGH CARDINALITY
MISSING

Distinct: 1725
Distinct (%): < 0.1%
Missing: 941355
Missing (%): 15.5%
Memory size: 46.4 MiB

Streeter Dr & Grand Ave               77569
DuSable Lake Shore Dr & North Blvd    42978
Michigan Ave & Oak St                 41397
DuSable Lake Shore Dr & Monroe St     41168
Wells St & Concord Ln                 40053
Other values (1720)                   4897121

Length

Max length: 64
Median length: 50
Mean length: 23.92562301
Min length: 9

Characters and Unicode

Total characters: 122984545
Distinct characters: 71
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 129
Unique (%): < 0.1%

Sample

1st row: Western Ave & Roscoe St
2nd row: Western Ave & 105th St
3rd row: Cottage Grove Ave & 51st St
4th row: Campbell Ave & Montrose Ave
5th row: California Ave & Milwaukee Ave

Common Values

Value                                 Count      Frequency (%)
Streeter Dr & Grand Ave               77569      1.3%
DuSable Lake Shore Dr & North Blvd    42978      0.7%
Michigan Ave & Oak St                 41397      0.7%
DuSable Lake Shore Dr & Monroe St     41168      0.7%
Wells St & Concord Ln                 40053      0.7%
Clark St & Elm St                     37216      0.6%
Millennium Park                       36787      0.6%
Kingsbury St & Kinzie St              35217      0.6%
Theater on the Lake                   33844      0.6%
Wells St & Elm St                     32767      0.5%
Other values (1715)                   4721290    77.6%
(Missing)                             941355     15.5%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
4889402
18.9%
st4491116
17.4%
ave3667362
 
14.2%
clark374302
 
1.4%
dr373823
 
1.4%
lake335533
 
1.3%
blvd307603
 
1.2%
rd261072
 
1.0%
wells193303
 
0.7%
park189759
 
0.7%
Other values (777)10773964
41.7%

Most occurring characters

ValueCountFrequency (%)
20717213
16.8%
e10701986
 
8.7%
t8029001
 
6.5%
a6832163
 
5.6%
r5871803
 
4.8%
n5746719
 
4.7%
S5691320
 
4.6%
l5309195
 
4.3%
o5113172
 
4.2%
&4861467
 
4.0%
Other values (61)44110506
35.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter75461147
61.4%
Space Separator20717213
 
16.8%
Uppercase Letter20680332
 
16.8%
Other Punctuation4940441
 
4.0%
Decimal Number974994
 
0.8%
Close Punctuation91141
 
0.1%
Open Punctuation91141
 
0.1%
Dash Punctuation28136
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e10701986
14.2%
t8029001
10.6%
a6832163
9.1%
r5871803
 
7.8%
n5746719
 
7.6%
l5309195
 
7.0%
o5113172
 
6.8%
v4701198
 
6.2%
i4330332
 
5.7%
d2867629
 
3.8%
Other values (16)15957949
21.1%
Uppercase Letter
ValueCountFrequency (%)
S5691320
27.5%
A4122411
19.9%
C1413570
 
6.8%
W1255349
 
6.1%
D1109389
 
5.4%
L945821
 
4.6%
M878274
 
4.2%
B876836
 
4.2%
R780881
 
3.8%
P773865
 
3.7%
Other values (16)2832616
13.7%
Decimal Number
ValueCountFrequency (%)
5214224
22.0%
3166676
17.1%
1159632
16.4%
894672
9.7%
966882
 
6.9%
664091
 
6.6%
060322
 
6.2%
754873
 
5.6%
249302
 
5.1%
444320
 
4.5%
Other Punctuation
ValueCountFrequency (%)
&4861467
98.4%
*45957
 
0.9%
.32295
 
0.7%
/712
 
< 0.1%
"10
 
< 0.1%
Space Separator
ValueCountFrequency (%)
20717213
100.0%
Close Punctuation
ValueCountFrequency (%)
)91141
100.0%
Open Punctuation
ValueCountFrequency (%)
(91141
100.0%
Dash Punctuation
ValueCountFrequency (%)
-28136
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin96141479
78.2%
Common26843066
 
21.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e10701986
 
11.1%
t8029001
 
8.4%
a6832163
 
7.1%
r5871803
 
6.1%
n5746719
 
6.0%
S5691320
 
5.9%
l5309195
 
5.5%
o5113172
 
5.3%
v4701198
 
4.9%
i4330332
 
4.5%
Other values (42)33814590
35.2%
Common
ValueCountFrequency (%)
20717213
77.2%
&4861467
 
18.1%
5214224
 
0.8%
3166676
 
0.6%
1159632
 
0.6%
894672
 
0.4%
)91141
 
0.3%
(91141
 
0.3%
966882
 
0.2%
664091
 
0.2%
Other values (9)315927
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII122984545
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
20717213
16.8%
e10701986
 
8.7%
t8029001
 
6.5%
a6832163
 
5.6%
r5871803
 
4.8%
n5746719
 
4.7%
S5691320
 
4.6%
l5309195
 
4.3%
o5113172
 
4.2%
&4861467
 
4.0%
Other values (61)44110506
35.9%

end_station_id
Categorical

HIGH CARDINALITY
MISSING

Distinct: 1323
Distinct (%): < 0.1%
Missing: 941496
Missing (%): 15.5%
Memory size: 46.4 MiB

13022                  77569
LF-005                 42978
13042                  41397
13300                  41168
TA1308000050           40053
Other values (1318)    4896980

Length

Max length: 37
Median length: 36
Mean length: 8.422088482
Min length: 3

Characters and Unicode

Total characters: 43290756
Distinct characters: 58
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 36
Unique (%): < 0.1%

Sample

1st row: 15634
2nd row: 691
3rd row: TA1309000067
4th row: 15623
5th row: 13084

Common Values

Value                  Count      Frequency (%)
13022                  77569      1.3%
LF-005                 42978      0.7%
13042                  41397      0.7%
13300                  41168      0.7%
TA1308000050           40053      0.7%
TA1307000039           37216      0.6%
13008                  36787      0.6%
KA1503000043           35217      0.6%
TA1308000001           33844      0.6%
KA1504000135           32767      0.5%
Other values (1313)    4721149    77.6%
(Missing)              941496     15.5%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
1302277569
 
1.5%
lf-00542978
 
0.8%
1304241397
 
0.8%
1330041168
 
0.8%
ta130800005040053
 
0.8%
ta130700003937216
 
0.7%
1300836787
 
0.7%
ka150300004335217
 
0.7%
ta130800000133844
 
0.7%
ka150400013532767
 
0.6%
Other values (1330)4721953
91.8%

Most occurring characters

ValueCountFrequency (%)
014029638
32.4%
16956014
16.1%
35524161
 
12.8%
A2509406
 
5.8%
52352498
 
5.4%
T1989789
 
4.6%
21773498
 
4.1%
41528660
 
3.5%
61454611
 
3.4%
71344389
 
3.1%
Other values (48)3828092
 
8.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number37004902
85.5%
Uppercase Letter5516043
 
12.7%
Lowercase Letter510766
 
1.2%
Dash Punctuation227455
 
0.5%
Other Punctuation30336
 
0.1%
Space Separator804
 
< 0.1%
Close Punctuation225
 
< 0.1%
Open Punctuation225
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A2509406
45.5%
T1989789
36.1%
K532678
 
9.7%
L196907
 
3.6%
S93108
 
1.7%
W48502
 
0.9%
F42978
 
0.8%
P40699
 
0.7%
R30063
 
0.5%
N14882
 
0.3%
Other values (11)17031
 
0.3%
Lowercase Letter
ValueCountFrequency (%)
g92247
18.1%
i46558
9.1%
c46427
9.1%
a46320
9.1%
r46268
9.1%
n46258
9.1%
h46248
9.1%
t46059
9.0%
s46042
9.0%
x45959
9.0%
Other values (11)2380
 
0.5%
Decimal Number
ValueCountFrequency (%)
014029638
37.9%
16956014
18.8%
35524161
 
14.9%
52352498
 
6.4%
21773498
 
4.8%
41528660
 
4.1%
61454611
 
3.9%
71344389
 
3.6%
91224906
 
3.3%
8816527
 
2.2%
Other Punctuation
ValueCountFrequency (%)
.30298
99.9%
&38
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
-227455
100.0%
Space Separator
ValueCountFrequency (%)
804
100.0%
Close Punctuation
ValueCountFrequency (%)
)225
100.0%
Open Punctuation
ValueCountFrequency (%)
(225
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common37263947
86.1%
Latin6026809
 
13.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
A2509406
41.6%
T1989789
33.0%
K532678
 
8.8%
L196907
 
3.3%
S93108
 
1.5%
g92247
 
1.5%
W48502
 
0.8%
i46558
 
0.8%
c46427
 
0.8%
a46320
 
0.8%
Other values (32)424867
 
7.0%
Common
ValueCountFrequency (%)
014029638
37.6%
16956014
18.7%
35524161
 
14.8%
52352498
 
6.3%
21773498
 
4.8%
41528660
 
4.1%
61454611
 
3.9%
71344389
 
3.6%
91224906
 
3.3%
8816527
 
2.2%
Other values (6)259045
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII43290756
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
014029638
32.4%
16956014
16.1%
35524161
 
12.8%
A2509406
 
5.8%
52352498
 
5.4%
T1989789
 
4.6%
21773498
 
4.1%
41528660
 
3.5%
61454611
 
3.4%
71344389
 
3.1%
Other values (48)3828092
 
8.8%

start_lat
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 722564
Distinct (%): 11.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41.90206158
Minimum: 41.64
Maximum: 42.07
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 MiB

Quantile statistics

Minimum: 41.64
5-th percentile: 41.8
Q1: 41.88096617
median: 41.89993001
Q3: 41.93
95-th percentile: 41.97
Maximum: 42.07
Range: 0.43
Interquartile range (IQR): 0.04903383333

Descriptive statistics

Standard deviation: 0.0462265043
Coefficient of variation (CV): 0.001103203579
Kurtosis: 2.14375908
Mean: 41.90206158
Median Absolute Deviation (MAD): 0.02274901
Skewness: -0.4768937608
Sum: 254833295.7
Variance: 0.0021368897
Monotonicity: Not monotonic
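
Several of the start_lat statistics are simple functions of the others. The sketch below recomputes them from the rounded values printed in this report, using the standard definitions as an assumption: range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared. Small differences in the last digits come from the rounding of the printed inputs.

```python
# Values copied from the start_lat quantile and descriptive statistics.
minimum, maximum = 41.64, 42.07
q1, q3 = 41.88096617, 41.93
mean = 41.90206158
std_dev = 0.0462265043

value_range = maximum - minimum   # reported Range: 0.43
iqr = q3 - q1                     # reported IQR: 0.04903383333
cv = std_dev / mean               # reported CV: 0.001103203579
variance = std_dev ** 2           # reported Variance: 0.0021368897

print(f"range    = {value_range:.2f}")
print(f"IQR      = {iqr:.8f}")
print(f"CV       = {cv:.9f}")
print(f"variance = {variance:.10f}")
```

The same checks apply to start_lng, where the negative mean explains the negative coefficient of variation.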
[Figure: histogram with fixed size bins (bins=50)]

Value                    Count      Frequency (%)
41.93                    99695      1.6%
41.89                    95050      1.6%
41.92                    92209      1.5%
41.94                    83274      1.4%
41.91                    81587      1.3%
41.9                     73058      1.2%
41.88                    64844      1.1%
41.95                    63830      1.0%
41.892278                53873      0.9%
41.79                    47505      0.8%
Other values (722554)    5326716    87.6%
ValueCountFrequency (%)
41.643
 
< 0.1%
41.6485007685
< 0.1%
41.6485018
 
< 0.1%
41.648530671
 
< 0.1%
41.648547171
 
< 0.1%
41.6485521
 
< 0.1%
41.648554171
 
< 0.1%
41.648554831
 
< 0.1%
41.648557831
 
< 0.1%
41.648559451
 
< 0.1%
ValueCountFrequency (%)
42.07305
< 0.1%
42.064869171
 
< 0.1%
42.064856331
 
< 0.1%
42.064854350
< 0.1%
42.064850331
 
< 0.1%
42.06484351
 
< 0.1%
42.06484251
 
< 0.1%
42.064839961
 
< 0.1%
42.064837671
 
< 0.1%
42.064828331
 
< 0.1%

start_lng
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct                          682475
Distinct (%)                      11.2%
Missing                           0
Missing (%)                       0.0%
Infinite                          0
Infinite (%)                      0.0%
Mean                              -87.64779682
Minimum                           -87.84
Maximum                           -87.52
Zeros                             0
Zeros (%)                         0.0%
Negative                          6081641
Negative (%)                      100.0%
Memory size                       46.4 MiB

Quantile statistics

Minimum                           -87.84
5-th percentile                   -87.7
Q1                                -87.661501
median                            -87.64410555
Q3                                -87.629634
95-th percentile                  -87.607267
Maximum                           -87.52
Range                             0.32
Interquartile range (IQR)         0.031867

Descriptive statistics

Standard deviation                0.02920161611
Coefficient of variation (CV)     -0.0003331699959
Kurtosis                          2.910488901
Mean                              -87.64779682
Median Absolute Deviation (MAD)   0.015894446
Skewness                          -0.9905368322
Sum                               -533042434.7
Variance                          0.0008527343832
Monotonicity                      Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]
Value                  Count     Frequency (%)
-87.65                 138302    2.3%
-87.63                 103546    1.7%
-87.64                 91466     1.5%
-87.66                 84833     1.4%
-87.67                 73781     1.2%
-87.69                 68184     1.1%
-87.68                 63469     1.0%
-87.7                  59692     1.0%
-87.612043             53859     0.9%
-87.62                 51929     0.9%
Other values (682465)  5292580   87.0%
ValueCountFrequency (%)
-87.8422
 
< 0.1%
-87.83332051
 
< 0.1%
-87.83330851
 
< 0.1%
-87.83326851
 
< 0.1%
-87.833254171
 
< 0.1%
-87.833248831
 
< 0.1%
-87.833235171
 
< 0.1%
-87.833228671
 
< 0.1%
-87.83675
< 0.1%
-87.82931151
 
< 0.1%
ValueCountFrequency (%)
-87.5230
 
< 0.1%
-87.5253141
 
< 0.1%
-87.52823174142
< 0.1%
-87.52823228
 
< 0.1%
-87.528330831
 
< 0.1%
-87.5283551
 
< 0.1%
-87.528365831
 
< 0.1%
-87.528371
 
< 0.1%
-87.528370831
 
< 0.1%
-87.528380331
 
< 0.1%

end_lat
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct                          1634
Distinct (%)                      < 0.1%
Missing                           0
Missing (%)                       0.0%
Infinite                          0
Infinite (%)                      0.0%
Mean                              41.9022444
Minimum                           0
Maximum                           42.37
Zeros                             8
Zeros (%)                         < 0.1%
Negative                          0
Negative (%)                      0.0%
Memory size                       46.4 MiB

Quantile statistics

Minimum                           0
5-th percentile                   41.8
Q1                                41.8810317
median                            41.9
Q3                                41.93
95-th percentile                  41.97
Maximum                           42.37
Range                             42.37
Interquartile range (IQR)         0.0489683

Descriptive statistics

Standard deviation                0.06680262885
Coefficient of variation (CV)     0.001594249421
Kurtosis                          203629.5423
Mean                              41.9022444
Median Absolute Deviation (MAD)   0.022819
Skewness                          -324.798497
Sum                               254834407.5
Variance                          0.004462591221
Monotonicity                      Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]
Value                  Count     Frequency (%)
41.93                  101809    1.7%
41.89                  101481    1.7%
41.92                  94602     1.6%
41.91                  85113     1.4%
41.94                  84475     1.4%
41.892278              77550     1.3%
41.9                   76158     1.3%
41.88                  70296     1.2%
41.95                  65069     1.1%
41.79                  48395     0.8%
Other values (1624)    5276693   86.8%
ValueCountFrequency (%)
08
 
< 0.1%
41.554
 
< 0.1%
41.591
 
< 0.1%
41.62
 
< 0.1%
41.611
 
< 0.1%
41.623
 
< 0.1%
41.635
 
< 0.1%
41.6411
 
< 0.1%
41.6485007689
< 0.1%
41.6485014
 
< 0.1%
ValueCountFrequency (%)
42.371
 
< 0.1%
42.192
 
< 0.1%
42.151
 
< 0.1%
42.132
 
< 0.1%
42.122
 
< 0.1%
42.115
 
< 0.1%
42.12
 
< 0.1%
42.093
 
< 0.1%
42.0841
 
< 0.1%
42.07385
< 0.1%

end_lng
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct                          1619
Distinct (%)                      < 0.1%
Missing                           0
Missing (%)                       0.0%
Infinite                          0
Infinite (%)                      0.0%
Mean                              -87.64786739
Minimum                           -88.14
Maximum                           0
Zeros                             8
Zeros (%)                         < 0.1%
Negative                          6081633
Negative (%)                      > 99.9%
Memory size                       46.4 MiB

Quantile statistics

Minimum                           -88.14
5-th percentile                   -87.7
Q1                                -87.662007
median                            -87.64414
Q3                                -87.62974631
95-th percentile                  -87.607267
Maximum                           0
Range                             88.14
Interquartile range (IQR)         0.0322606948

Descriptive statistics

Standard deviation                0.1047203596
Coefficient of variation (CV)     -0.001194785027
Kurtosis                          645518.4026
Mean                              -87.64786739
Median Absolute Deviation (MAD)   0.01586
Skewness                          771.2368006
Sum                               -533042863.9
Variance                          0.01096635371
Monotonicity                      Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]
Value                  Count     Frequency (%)
-87.65                 143424    2.4%
-87.63                 111969    1.8%
-87.64                 96507     1.6%
-87.66                 88477     1.5%
-87.612043             77550     1.3%
-87.67                 77432     1.3%
-87.69                 68840     1.1%
-87.68                 65148     1.1%
-87.7                  59483     1.0%
-87.62                 57626     0.9%
Other values (1609)    5235185   86.1%
ValueCountFrequency (%)
-88.141
 
< 0.1%
-88.051
 
< 0.1%
-87.931
 
< 0.1%
-87.924
< 0.1%
-87.91
 
< 0.1%
-87.893
 
< 0.1%
-87.884
< 0.1%
-87.873
 
< 0.1%
-87.868
< 0.1%
-87.857
< 0.1%
ValueCountFrequency (%)
08
 
< 0.1%
-87.34
 
< 0.1%
-87.51
 
< 0.1%
-87.516
 
< 0.1%
-87.5252
 
< 0.1%
-87.52823174223
< 0.1%
-87.52823230
 
< 0.1%
-87.528451171
 
< 0.1%
-87.53332
< 0.1%
-87.53043190
 
< 0.1%

member_casual
Categorical

Distinct               2
Distinct (%)           < 0.1%
Missing                0
Missing (%)            0.0%
Memory size            46.4 MiB

member                 3659726
casual                 2421915

Length

Max length             6
Median length          6
Mean length            6
Min length             6

Characters and Unicode

Total characters       36489846
Distinct characters    9
Distinct categories    1
Distinct scripts       1
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique                 0
Unique (%)             0.0%

Sample

1st row                casual
2nd row                casual
3rd row                casual
4th row                casual
5th row                casual

Common Values

Value    Count     Frequency (%)
member   3659726   60.2%
casual   2421915   39.8%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]
Value    Count     Frequency (%)
member   3659726   60.2%
casual   2421915   39.8%

Most occurring characters

Value   Count     Frequency (%)
m       7319452   20.1%
e       7319452   20.1%
a       4843830   13.3%
b       3659726   10.0%
r       3659726   10.0%
c       2421915   6.6%
s       2421915   6.6%
u       2421915   6.6%
l       2421915   6.6%

Most occurring categories

Value              Count      Frequency (%)
Lowercase Letter   36489846   100.0%

Most frequent character per category

Lowercase Letter
Value   Count     Frequency (%)
m       7319452   20.1%
e       7319452   20.1%
a       4843830   13.3%
b       3659726   10.0%
r       3659726   10.0%
c       2421915   6.6%
s       2421915   6.6%
u       2421915   6.6%
l       2421915   6.6%

Most occurring scripts

Value   Count      Frequency (%)
Latin   36489846   100.0%

Most frequent character per script

Latin
Value   Count     Frequency (%)
m       7319452   20.1%
e       7319452   20.1%
a       4843830   13.3%
b       3659726   10.0%
r       3659726   10.0%
c       2421915   6.6%
s       2421915   6.6%
u       2421915   6.6%
l       2421915   6.6%

Most occurring blocks

Value   Count      Frequency (%)
ASCII   36489846   100.0%

Most frequent character per block

ASCII
Value   Count     Frequency (%)
m       7319452   20.1%
e       7319452   20.1%
a       4843830   13.3%
b       3659726   10.0%
r       3659726   10.0%
c       2421915   6.6%
s       2421915   6.6%
u       2421915   6.6%
l       2421915   6.6%

duration_in_mins
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct                          1495
Distinct (%)                      < 0.1%
Missing                           0
Missing (%)                       0.0%
Infinite                          0
Infinite (%)                      0.0%
Mean                              15.41898691
Minimum                           -10354
Maximum                           34354
Zeros                             137843
Zeros (%)                         2.3%
Negative                          101
Negative (%)                      < 0.1%
Memory size                       46.4 MiB

Quantile statistics

Minimum                           -10354
5-th percentile                   2
Q1                                5
median                            10
Q3                                17
95-th percentile                  43
Maximum                           34354
Range                             44708
Interquartile range (IQR)         12

Descriptive statistics

Standard deviation                37.75881048
Coefficient of variation (CV)     2.448851581
Kurtosis                          207129.8497
Mean                              15.41898691
Median Absolute Deviation (MAD)   5
Skewness                          261.3644865
Sum                               93772743
Variance                          1425.727769
Monotonicity                      Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]
Value                  Count     Frequency (%)
5                      404947    6.7%
4                      402503    6.6%
6                      387523    6.4%
7                      361315    5.9%
3                      354935    5.8%
8                      331728    5.5%
9                      302725    5.0%
10                     273087    4.5%
11                     242918    4.0%
2                      241929    4.0%
Other values (1485)    2778031   45.7%
Value    Count   Frequency (%)
-10354   1       < 0.1%
-169     1       < 0.1%
-138     1       < 0.1%
-131     1       < 0.1%
-130     1       < 0.1%
-128     1       < 0.1%
-127     1       < 0.1%
-126     1       < 0.1%
-118     1       < 0.1%
-102     1       < 0.1%

Value    Count   Frequency (%)
34354    1       < 0.1%
32035    1       < 0.1%
13928    1       < 0.1%
10807    1       < 0.1%
10722    1       < 0.1%
10126    1       < 0.1%
9962     1       < 0.1%
8755     1       < 0.1%
8243     1       < 0.1%
7545     1       < 0.1%
< 0.1%

distance_km
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct                          114173
Distinct (%)                      1.9%
Missing                           0
Missing (%)                       0.0%
Infinite                          0
Infinite (%)                      0.0%
Mean                              2.11474072
Minimum                           0
Maximum                           9814.083
Zeros                             328459
Zeros (%)                         5.4%
Negative                          0
Negative (%)                      0.0%
Memory size                       46.4 MiB

Quantile statistics

Minimum                           0
5-th percentile                   0
Q1                                0.865
median                            1.552
Q3                                2.7543
95-th percentile                  5.8655
Maximum                           9814.083
Range                             9814.083
Interquartile range (IQR)         1.8893

Descriptive statistics

Standard deviation                11.41293769
Coefficient of variation (CV)     5.396849638
Kurtosis                          718233.5981
Mean                              2.11474072
Median Absolute Deviation (MAD)   0.8269
Skewness                          835.5510045
Sum                               12861093.87
Variance                          130.2551467
Monotonicity                      Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]
Value                  Count     Frequency (%)
0                      328459    5.4%
1.112                  60088     1.0%
2.2239                 16388     0.3%
0.7166                 13431     0.2%
1.0239                 12274     0.2%
1.386                  12191     0.2%
1.387                  11299     0.2%
0.8272                 7393      0.1%
1.3175                 7237      0.1%
0.8291                 6832      0.1%
Other values (114163)  5606049   92.2%
Value       Count    Frequency (%)
0           328459   5.4%
0.0001      28       < 0.1%
0.0002      48       < 0.1%
0.0003      78       < 0.1%
0.0004      124      < 0.1%
0.0005      112      < 0.1%
0.0006      159      < 0.1%
0.0007      182      < 0.1%
0.0008      193      < 0.1%
0.0009      227      < 0.1%

Value       Count    Frequency (%)
9814.083    1        < 0.1%
9813.3911   1        < 0.1%
9813.0858   1        < 0.1%
9812.9313   1        < 0.1%
9812.9299   1        < 0.1%
9812.1879   1        < 0.1%
9811.8243   1        < 0.1%
9811.5246   1        < 0.1%
42.2722     1        < 0.1%
37.6787     1        < 0.1%
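
The report does not state how distance_km was computed. The values are consistent with a great-circle (haversine) distance between the start and end coordinates, and under that assumption the eight largest values (about 9812 to 9814 km) line up with the eight rows whose end_lat and end_lng are 0. The sketch below illustrates this. The function name `haversine_km` and the Earth radius of 6371 km are assumptions, not details taken from the report.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# First sample row of this report: listed distance_km is 2.7193.
print(round(haversine_km(41.93, -87.69, 41.94, -87.72), 4))

# A typical start point paired with a (0, 0) end point gives a distance of
# several thousand kilometres, the scale of the largest reported values.
print(round(haversine_km(41.9, -87.65, 0.0, 0.0), 1))
```

If this reading is right, filtering out rows where end_lat or end_lng equals 0 would remove the extreme distances before further analysis.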

Interactions


Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
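
The calculation described above can be sketched in plain Python: rank both variables, then apply the Pearson formula to the ranks. This is an illustrative sketch, not the code the profiler runs. The helper names `average_ranks`, `pearson_r`, and `spearman_rho` are hypothetical, and ties are given average ranks, which is a common convention assumed here.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship still gives rho close to 1.
print(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```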

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
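
As a minimal sketch of that formula, and of the invariance under changes of location and scale, the following plain-Python function computes r directly. The name `pearson_r` and the sample data are illustrative assumptions, not taken from the report.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    # The 1/n factors of covariance and standard deviations cancel out.
    return cov / sqrt(var_x * var_y)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r_original = pearson_r(x, y)

# Shifting and rescaling each variable separately (positive scale factors)
# leaves r unchanged, up to floating-point rounding.
r_transformed = pearson_r([2 * a + 7 for a in x], [0.5 * b - 1 for b in y])
print(r_original, r_transformed)
```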

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
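
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition. Statistical libraries often use a tie-adjusted variant instead, so results can differ when the data contain ties. The function name `kendall_tau_a` is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Tied pairs (direction == 0) count towards neither group.
    return (concordant - discordant) / len(pairs)


# Six pairs in total: five concordant, one discordant (the swapped 3 and 2).
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

This version compares every pair, so its cost grows quadratically with the number of observations. It is meant to show the definition, not to scale to millions of rows.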

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
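
For illustration, the bias-corrected estimator can be sketched from a contingency table of counts as follows. This is a plain-Python sketch of the commonly cited form of Bergsma's correction, not the profiler's implementation. The function name `cramers_v_corrected` and the example tables are assumptions.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k table given as a list of rows."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corrected = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    k_corrected = k - (k - 1) ** 2 / (n - 1)
    r_corrected = r - (r - 1) ** 2 / (n - 1)
    denominator = min(k_corrected - 1, r_corrected - 1)
    return sqrt(phi2_corrected / denominator) if denominator > 0 else 0.0


# Independent variables give 0; a perfectly associated table gives about 1.
print(cramers_v_corrected([[10, 10], [10, 10]]))
print(cramers_v_corrected([[50, 0], [0, 50]]))
```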

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
[Figure] The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
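
Nullity correlation can be illustrated by correlating presence indicators of two columns. The sketch below assumes the heatmap is a Pearson correlation of such indicators, which the report does not confirm. The function name `nullity_correlation` and the example columns are hypothetical, with `None` standing in for NaN.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if value is None else 1.0 for value in col_a]
    b = [0.0 if value is None else 1.0 for value in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return None  # undefined for fully present or fully missing columns
    return cov / sqrt(var_a * var_b)


# Station name and station id missing on exactly the same rows.
names = ["Clark St & Elm St", None, None, "Ogden Ave & Congress Pkwy"]
ids = ["TA1307000039", None, None, "13081"]
print(nullity_correlation(names, ids))
```

A value near 1 means the two columns tend to be missing together, as the station name and station id columns are in the first sample rows of this report.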

Sample

First rows

(index) | ride_id | rideable_type | started_at | ended_at | start_station_name | start_station_id | end_station_name | end_station_id | start_lat | start_lng | end_lat | end_lng | member_casual | duration_in_mins | distance_km
0 | 550CF7EFEAE0C618 | electric_bike | 2022-08-07 21:34:15 | 2022-08-07 21:41:46 | NaN | NaN | NaN | NaN | 41.93 | -87.69 | 41.94 | -87.72 | casual | 7.0 | 2.7193
1 | DAD198F405F9C5F5 | electric_bike | 2022-08-08 14:39:21 | 2022-08-08 14:53:23 | NaN | NaN | NaN | NaN | 41.89 | -87.64 | 41.92 | -87.64 | casual | 14.0 | 3.3359
2 | E6F2BC47B65CB7FD | electric_bike | 2022-08-08 15:29:50 | 2022-08-08 15:40:34 | NaN | NaN | NaN | NaN | 41.97 | -87.69 | 41.97 | -87.66 | casual | 10.0 | 2.4802
3 | F597830181C2E13C | electric_bike | 2022-08-08 02:43:50 | 2022-08-08 02:58:53 | NaN | NaN | NaN | NaN | 41.94 | -87.65 | 41.97 | -87.69 | casual | 15.0 | 4.6977
4 | 0CE689BB4E313E8D | electric_bike | 2022-08-07 20:24:06 | 2022-08-07 20:29:58 | NaN | NaN | NaN | NaN | 41.85 | -87.65 | 41.84 | -87.66 | casual | 5.0 | 1.3866
5 | BFA7E7CC69860C20 | electric_bike | 2022-08-08 13:06:08 | 2022-08-08 13:19:09 | NaN | NaN | NaN | NaN | 41.79 | -87.72 | 41.82 | -87.69 | casual | 13.0 | 4.1607
6 | 68C474A4E92F24B6 | electric_bike | 2022-08-08 14:02:40 | 2022-08-08 14:11:36 | NaN | NaN | NaN | NaN | 41.89 | -87.63 | 41.89 | -87.61 | casual | 8.0 | 1.6555
7 | 14A985A3838AA8CC | electric_bike | 2022-08-07 20:56:17 | 2022-08-07 21:14:14 | NaN | NaN | NaN | NaN | 41.96 | -87.68 | 41.92 | -87.68 | casual | 17.0 | 4.4478
8 | E724B94BCE2E7E36 | electric_bike | 2022-08-07 21:30:05 | 2022-08-07 21:41:28 | NaN | NaN | NaN | NaN | 41.92 | -87.68 | 41.93 | -87.72 | casual | 11.0 | 3.4911
9 | 1AA3756A6F8189DB | electric_bike | 2022-08-07 23:53:05 | 2022-08-08 00:04:14 | NaN | NaN | NaN | NaN | 41.93 | -87.72 | 41.92 | -87.68 | casual | 11.0 | 3.4911

Last rows

(index) | ride_id | rideable_type | started_at | ended_at | start_station_name | start_station_id | end_station_name | end_station_id | start_lat | start_lng | end_lat | end_lng | member_casual | duration_in_mins | distance_km
6081631 | 6DEFEFE78E1277D1 | classic_bike | 2023-01-09 12:31:57 | 2023-01-09 12:44:30 | Clark St & Elm St | TA1307000039 | Southport Ave & Clybourn Ave | TA1309000030 | 41.902973 | -87.631280 | 41.920771 | -87.663712 | casual | 12.0 | 3.3345
6081632 | 0CB0D8048D672CE7 | electric_bike | 2023-01-04 17:30:21 | 2023-01-04 17:42:12 | Clark St & Elm St | TA1307000039 | Southport Ave & Clybourn Ave | TA1309000030 | 41.902875 | -87.631781 | 41.920771 | -87.663712 | casual | 11.0 | 3.3078
6081633 | 8D7369B7FE52D037 | electric_bike | 2023-01-20 18:21:26 | 2023-01-20 18:55:42 | Ogden Ave & Congress Pkwy | 13081 | Clarendon Ave & Gordon Ter | 13379 | 41.874960 | -87.673230 | 41.957867 | -87.649505 | member | 34.0 | 9.4254
6081634 | 920FD3D051E63678 | electric_bike | 2023-01-17 18:39:41 | 2023-01-17 19:00:33 | Clark St & Elm St | TA1307000039 | Southport Ave & Clybourn Ave | TA1309000030 | 41.902781 | -87.631663 | 41.920771 | -87.663712 | casual | 20.0 | 3.3218
6081635 | A3DC3E8358DB1FAA | electric_bike | 2023-01-17 18:36:00 | 2023-01-17 19:00:26 | Clark St & Elm St | TA1307000039 | Southport Ave & Clybourn Ave | TA1309000030 | 41.902760 | -87.631491 | 41.920771 | -87.663712 | casual | 24.0 | 3.3347
6081636 | A303816F2E8A35A8 | electric_bike | 2023-01-11 17:46:23 | 2023-01-11 17:57:31 | Clark St & Elm St | TA1307000039 | Southport Ave & Clybourn Ave | TA1309000030 | 41.902634 | -87.631591 | 41.920771 | -87.663712 | casual | 11.0 | 3.3365
6081637 | BCDBB142CC610382 | classic_bike | 2023-01-30 15:08:10 | 2023-01-30 15:33:26 | Western Ave & Leland Ave | TA1307000140 | Clarendon Ave & Gordon Ter | 13379 | 41.966400 | -87.688704 | 41.957867 | -87.649505 | member | 25.0 | 3.3771
6081638 | 7D1C7CA80517183B | classic_bike | 2023-01-06 19:34:50 | 2023-01-06 19:50:01 | Clark St & Elm St | TA1307000039 | Southport Ave & Clybourn Ave | TA1309000030 | 41.902973 | -87.631280 | 41.920771 | -87.663712 | casual | 15.0 | 3.3345
6081639 | 1A4EB636346DF527 | classic_bike | 2023-01-13 18:59:24 | 2023-01-13 19:14:44 | Clark St & Elm St | TA1307000039 | Southport Ave & Clybourn Ave | TA1309000030 | 41.902973 | -87.631280 | 41.920771 | -87.663712 | casual | 15.0 | 3.3345
6081640 | 069971675AC7DC62 | electric_bike | 2023-01-02 13:48:29 | 2023-01-02 13:59:29 | Clark St & Elm St | TA1307000039 | Southport Ave & Clybourn Ave | TA1309000030 | 41.902822 | -87.631687 | 41.920771 | -87.663712 | casual | 11.0 | 3.3175